Understanding costs
Your task
This investigation examines whether differences in property attributes account for differences in carbon emissions between English and Welsh local authorities. The aim is not to predict emissions for individual properties, but to determine how much of the variation across authorities persists after accounting for differences in housing stock, specifically whether older or larger homes are associated with higher emissions.
We use a mixed-effects model because we have several observations (house “types”) within each local authority. The fixed effects capture systematic differences related to property age and room count, whereas the random effects measure how far each authority deviates from the national average.
The Problem
The Energy Performance Certificate (EPC) data records detailed building energy characteristics, including:
Energy efficiency rating (A-G)
Estimated yearly CO₂ emissions
Expenses for heating, lighting, and hot water
Suggestions for improvement
The housing stock varies between local authorities. Some regions have a high concentration of pre-1930 properties, which tend to suffer from inadequate insulation and outdated building practices. Others have larger homes, which often demand more heating energy.
We were asked to compare authorities fairly, while allowing for:
- whether properties are pre-1930 or post-1930
- the number of heated rooms (1–10)
This naturally leads to the mixed model shown below:
\[Y_{ji} = \beta_0 + \beta_1 x_{j}^{(\text{old})} + \sum_{r=1}^{10} \beta_r x_{j}^{(r)} + \gamma_j\]
Where
- \(Y_{ji}\) is the average carbon dioxide emissions for authority \(j\) for each house type \(i\) (old/new, 1, 2, …, 10 rooms)
- \(x_{j}^{(\text{old})}\) is the number of houses built before 1930 in authority \(j\)
- \(x_{j}^{(r)}\) is the number of houses with \(r = 1, 2, \ldots, 10\) rooms in authority \(j\)
- \(\beta_0\), \(\beta_1\), \(\beta_r\) are the fixed effects
- \(\gamma_j\) is the random effect for local authority \(j\)
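The regression summary reported later in this document lists one coefficient for `age[T.Recent]` and a single slope for `n_rooms`, rather than ten separate room coefficients. Under that reading, the model that was actually fitted can be sketched as below. This is an interpretation of the output, not a statement of the original specification; the symbols \(\varepsilon_{ji}\), \(\sigma_\gamma^2\) and \(\sigma^2\) are introduced here for illustration.

\[Y_{ji} = \beta_0 + \beta_1\,\text{recent}_{i} + \beta_2\,\text{rooms}_{i} + \gamma_j + \varepsilon_{ji}, \qquad \gamma_j \sim N(0, \sigma_\gamma^2), \quad \varepsilon_{ji} \sim N(0, \sigma^2)\]

Here \(\text{recent}_{i}\) is 1 for a post-1930 house type and 0 otherwise, and \(\text{rooms}_{i}\) is the number of heated rooms for house type \(i\).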
The Data
Some initial exploratory data analysis shows us how the carbon emissions vary by age of property and by the number of rooms:
[Figures: carbon emissions by age of property and by number of rooms]

The plots indicate:

- **Pre-1930 homes produce much more CO₂**, indicating older building quality.
- **Emissions increase progressively with each new room**, as larger homes require more energy.

These patterns warrant incorporating both variables in the fixed-effects model.

## The model

A model can be fitted and a summary of the fixed effects produced:

```
          Mixed Linear Model Regression Results
=========================================================
Model:            MixedLM Dependent Variable: shortfall
No. Observations: 6917    Method:             REML
No. Groups:       346     Scale:              3.5884
Min. group size:  17      Log-Likelihood:     -14453.0511
Max. group size:  20      Converged:          Yes
Mean group size:  20.0
---------------------------------------------------------
               Coef.  Std.Err.    z    P>|z| [0.025 0.975]
---------------------------------------------------------
Intercept      2.409    0.065  37.206 0.000  2.282  2.536
age[T.Recent] -2.981    0.046 -65.447 0.000 -3.071 -2.892
n_rooms        1.054    0.008 132.940 0.000  1.039  1.070
Group Var      0.434    0.025
=========================================================
```
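The summary above has the layout of a statsmodels `MixedLM` result, so a call along the following lines could produce output of that shape. This is a minimal sketch on simulated data, not the code used for this report: the column name `authority`, the level label `"Pre-1930"`, and the simulated data frame are all assumptions. Only `shortfall`, `age`, `n_rooms` and the level `Recent` appear in the original output.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the real table: 346 authorities, each with
# 2 age bands x 10 room counts. The EPC data themselves are not reproduced.
rng = np.random.default_rng(0)
rows = []
for j in range(346):
    gamma_j = rng.normal(0.0, np.sqrt(0.434))      # authority random intercept
    for age in ("Pre-1930", "Recent"):             # "Pre-1930" is an assumed label
        for n_rooms in range(1, 11):
            mean = 2.409 - 2.981 * (age == "Recent") + 1.054 * n_rooms
            noise = rng.normal(0.0, np.sqrt(3.5884))
            rows.append((f"LA{j:03d}", age, n_rooms, mean + gamma_j + noise))
df = pd.DataFrame(rows, columns=["authority", "age", "n_rooms", "shortfall"])

# Random-intercept model: fixed effects for age and room count,
# one random intercept per local authority, fitted by REML.
model = smf.mixedlm("shortfall ~ age + n_rooms", data=df, groups=df["authority"])
result = model.fit(reml=True)
print(result.summary())

# Variance of the authority intercepts (the "Group Var" row of the summary).
group_var = float(np.asarray(result.cov_re)[0, 0])
```

Because the data are simulated from the reported coefficients, the estimates should land close to, but not exactly on, the values in the summary.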
Fixed effects interpretation
Intercept (β₀ = 2.409)
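This is the expected emissions value for:
- a pre-1930 property (the reference level for age)
- with zero heated rooms, assuming the room count enters the model uncentred
- in an “average” local authority (random effect of zero)
Because heated rooms range from 1 to 10, the intercept is an extrapolation. For a pre-1930 property with one heated room, the fixed effects give 2.409 + 1.054 = 3.463.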
Property age’s impact (β₁ = –2.981)
The coefficient for age[T.Recent] compares post-1930 properties with pre-1930 properties, the reference level.
- On average, homes built after 1930 emit 2.981 fewer units of CO₂ than pre-1930 homes with the same number of rooms.
This effect is large and highly statistically significant (|z| > 65).
It indicates that newer buildings are substantially more carbon-efficient.
### The impact of the number of rooms (β₂ = 1.054)
Each additional heated room is associated with higher emissions:
- On average, +1.054 units of CO₂ per room.
Once more, this is a substantial effect (z ≈ 133), indicating that **property size** is one of the main drivers of emissions.
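To make the fixed effects concrete, the sketch below combines the three reported coefficients into predicted values for a few house types. The function name `expected_emissions` and the `authority_effect` argument are illustrative, and the calculation assumes `n_rooms` enters the model as a plain count.

```python
# Fixed-effect estimates copied from the model summary.
INTERCEPT = 2.409
BETA_RECENT = -2.981   # age[T.Recent]
BETA_ROOMS = 1.054     # n_rooms


def expected_emissions(recent: bool, n_rooms: int, authority_effect: float = 0.0) -> float:
    """Predicted value for one house type.

    authority_effect is the authority's random intercept; 0.0 corresponds
    to an "average" authority.
    """
    return INTERCEPT + BETA_RECENT * int(recent) + BETA_ROOMS * n_rooms + authority_effect


for recent in (False, True):
    label = "post-1930" if recent else "pre-1930"
    for n_rooms in (1, 5, 10):
        print(f"{label:9s} {n_rooms:2d} rooms: {expected_emissions(recent, n_rooms):.3f}")
```

At any fixed room count, the gap between the pre-1930 and post-1930 predictions is 2.981 units, which is roughly the effect of three additional heated rooms.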
Random effects (0.434 group variance)
The random effect shows how far each local authority deviates from the national average after accounting for:
- the age distribution of properties
- the number of rooms
A group variance of 0.434 means:
- There are still real differences between authorities.
- Some consistently perform better or worse than the national average.
- These differences cannot be explained entirely by housing characteristics alone.
This motivates examining the caterpillar plot.
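One way to put the group variance in context is to compare it with the residual variance. The sketch below assumes that the `Scale` entry in the summary (3.5884) is the residual variance and that `Group Var` (0.434) is the variance of the authority intercepts, which is how statsmodels reports them. The intraclass correlation it computes is an addition for illustration, not a quantity reported in the original analysis.

```python
import math

GROUP_VAR = 0.434       # "Group Var" from the summary
RESIDUAL_VAR = 3.5884   # "Scale" from the summary (assumed residual variance)

# Standard deviation of the authority-level intercepts.
authority_sd = math.sqrt(GROUP_VAR)

# Intraclass correlation: share of the unexplained variance that sits
# between authorities rather than within them.
icc = GROUP_VAR / (GROUP_VAR + RESIDUAL_VAR)

# Under a normal random effect, about 95% of authorities lie within
# +/- 1.96 standard deviations of the national average.
half_width_95 = 1.96 * authority_sd

print(f"authority SD: {authority_sd:.3f}")
print(f"ICC:          {icc:.3f}")
print(f"95% range:    +/- {half_width_95:.3f}")
```

On these figures, roughly a tenth of the variation left after the fixed effects is attributable to the authority, and a typical authority sits within about 1.3 units of the national average.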
The key insight
We’ve been asked to rate local authority performance, so the most important result is a “caterpillar plot” of the random effects.
How to understand the plot
Authorities on the left of the plot have negative random effects:
- They emit less CO₂ than expected given their housing stock.
- Their performance is better than average.
Authorities on the right have positive random effects:
- They perform worse than average.
- They emit more CO₂ than expected.
The confidence intervals show the statistical uncertainty of each estimate.
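The reading rules above can be sketched as a small classification step: sort the authorities by their estimated random effect, then check whether each confidence interval excludes zero. The authority names, estimates and standard errors below are made up purely for illustration, and the 1.96 multiplier assumes approximately normal 95% intervals.

```python
from typing import NamedTuple


class AuthorityEffect(NamedTuple):
    name: str
    estimate: float   # predicted random intercept
    std_err: float    # standard error of that prediction


def classify(effect: AuthorityEffect, z: float = 1.96) -> str:
    """Label an authority by whether its interval lies wholly on one side of zero."""
    lower = effect.estimate - z * effect.std_err
    upper = effect.estimate + z * effect.std_err
    if upper < 0:
        return "better than average"
    if lower > 0:
        return "worse than average"
    return "not distinguishable from average"


# Hypothetical values, NOT taken from the fitted model.
effects = [
    AuthorityEffect("Authority C", 0.75, 0.30),
    AuthorityEffect("Authority A", -0.90, 0.30),
    AuthorityEffect("Authority B", 0.10, 0.30),
]

# A caterpillar plot shows the same ordering, left (lowest) to right (highest).
ranked = sorted(effects, key=lambda e: e.estimate)
for e in ranked:
    print(f"{e.name}: {e.estimate:+.2f} ({classify(e)})")
```

An authority whose interval overlaps zero should not be described as better or worse than average, even if its point estimate is on one side.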
What this implies
Even after accounting for:
- the number of houses built before 1930
- the average size of buildings
certain authorities consistently achieve lower emissions, which may reflect:
- improved insulation programmes
- more advanced heating systems
- local policy changes
- sociodemographic differences that the model does not account for
Conversely, underperforming authorities may need targeted support or further investigation.
Record of AI use for MTHM503 supervised coursework
Instructions: You can use this document to record when, how and why you used GenAI to complete your assessment. It will help you create a record of AI use to submit alongside your references for AI-integrated and AI-assisted assignments. It may also be useful to help you discuss your AI use if you are required to do so in an academic conduct meeting.
| Date | AI tool used | Purpose | Prompt | Hyperlink to output (where possible) | Section of work used for |
|---|---|---|---|---|---|
| 11/12/2025 | ChatGPT | To improve structure and academic tone | “Can you help me improve the structure and check the academic tone?” | N/A | Multiple Sections |
| 10/12/2025 | ChatGPT | To improve grammar and academic wording | “Check the paragraph for any grammatical errors and see if the words are suitable for an academic tone” | N/A | Multiple Sections |